Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Relational algebraic operation algorithm on compressed data
DING Xinzhe, ZHANG Zhaogong, LI Jianzhong, TAN Long, LIU Yong
Journal of Computer Applications    2016, 36 (1): 21-26.   DOI: 10.11772/j.issn.1001-9081.2016.01.0021
Abstract619)      PDF (923KB)(374)       Save
Since in the massive data management, the compressed data can be done some operations without decompressing first, under the condition of normal distribution, according to features of column data storage, a new compression algorithm which oriented column storage, called CCA (Column Compression Algorithm), was proposed. Firstly, the length of data was classified; secondly, the sampling method was used to get more repetitive prefix; finally the dictionary coding was utilized to compress, meanwhile the Column Index (CI) and Column Reality (CR) were acted as data compression structure to reduce storage requirement of massive data storage, thus the basic relational algebraic operations such as select, project and join were directly and effectively supported. A prototype database system based on CCA, called D-DBMS (Ding-Database Management System), was implemented. The theoretical analyses and the results of experiment on 1 TB data show that the proposed compression algorithm can significantly improve the performance of massive data storage efficiency and data manipulation. Compared to BAP (Bit Address Physical) and TIDC (TupleID Center) method, the compression rate of CCA was improved by 51% and 14%, and its running speed was improved by 47% and 42%.
Reference | Related Articles | Metrics